85 research outputs found

    Highly Efficient Twin Module Structure of 64-Bit Exponential Function Implemented on SGI RASC Platform

    This paper presents an implementation of the double-precision exponential function. A novel table-based architecture, together with a short Taylor expansion, provides a low latency (30 clock cycles), which is comparable to that of 32-bit implementations. The low area consumption of a single exp() module (roughly 4% of an XC4LX200) allows several modules to be implemented in a single FPGA. The employment of massive parallelism results in high performance of the module. Nevertheless, because of the external memory interface limitation, only a twin-module structure is presented in this paper. This implementation aims primarily to meet the strict precision and speed requirements of quantum chemistry. Each module is capable of processing at 200 MHz with a maximum error of 1 ulp and an RMSE of 0.6.
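The general idea of pairing a lookup table with a short Taylor expansion can be sketched in software as follows. This is only an illustration, not the paper's hardware architecture: the table size (`TABLE_BITS`), the polynomial degree, and the function name `exp_table_taylor` are assumptions chosen for readability.

```python
import math

TABLE_BITS = 8  # assumed table resolution; the abstract does not state the real one
STEP = 1.0 / (1 << TABLE_BITS)
# Precomputed exp(j * STEP) for j in [0, 2**TABLE_BITS)
TABLE = [math.exp(j * STEP) for j in range(1 << TABLE_BITS)]
LN2 = math.log(2.0)


def exp_table_taylor(x):
    """Approximate exp(x) as 2**k * TABLE[j] * P(s), with x = k*ln2 + j*STEP + s."""
    k = math.floor(x / LN2)      # power-of-two exponent
    r = x - k * LN2              # reduced argument, roughly in [0, ln 2)
    j = int(r / STEP)            # table index taken from the leading bits of r
    s = r - j * STEP             # small remainder, |s| < STEP
    # Short (degree-4) Taylor polynomial of exp(s), evaluated in Horner form
    poly = 1.0 + s * (1.0 + s * (0.5 + s * (1.0 / 6.0 + s / 24.0)))
    return math.ldexp(TABLE[j] * poly, k)
```

Because the remainder `s` is smaller than `STEP`, a few polynomial terms are enough; a larger table would allow an even shorter polynomial at the cost of more memory.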

    Weak RSA Key Discovery on GPGPU

    We address one of the weaknesses of RSA ciphering systems, i.e. the existence of private keys that are relatively easy for an attacker to compromise. The problem can be mitigated by Internet service providers, but it requires some computational effort. We propose a proof of concept of a GPGPU-accelerated system that can help detect and eliminate users' weak keys. We have proposed the algorithms and developed GPU-optimised program code that is now publicly available and substantially outperforms the tested CPU. The source code of the OpenSSL library was adapted for GPGPU, and the resulting code can run on both GPUs and CPUs. Additionally, we present a way to map a triangular grid onto the rectangular GPU grid, a basic dilemma in many problems that involve pair-wise analysis of a set of elements. Also, the comparison of two data caching methods on GPGPU leads to interesting general conclusions. We present the results of experiments analysing the performance of the selected algorithms for various RSA key lengths, GPU grid configurations, and sizes of the tested key set.
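A pair-wise analysis over n elements only needs the n(n-1)/2 pairs of a triangle, so each flat work-item index has to be turned into a pair (i, j) with i < j. The sketch below shows one common way to do that mapping on the CPU. It is not the paper's GPU code: the helper names are hypothetical, and the use of a GCD between two moduli as the weak-key test is an assumption (it is a widely used check for shared prime factors), demonstrated here on toy numbers.

```python
from math import gcd, isqrt


def pair_from_index(t):
    """Map a flat index t >= 0 to the pair (i, j), i < j, with t = j*(j-1)/2 + i."""
    j = (1 + isqrt(8 * t + 1)) // 2   # largest j such that j*(j-1)/2 <= t
    i = t - j * (j - 1) // 2
    return i, j


def find_shared_factors(moduli):
    """Return (i, j, g) for every pair of moduli sharing a factor g > 1."""
    n = len(moduli)
    hits = []
    for t in range(n * (n - 1) // 2):   # one "thread" per pair of the triangle
        i, j = pair_from_index(t)
        g = gcd(moduli[i], moduli[j])
        if g > 1:
            hits.append((i, j, g))
    return hits
```

With the flat index in hand, no work item is wasted on the empty half of a square grid; on a GPU the loop body would run once per thread.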

    Implementation of a RANLUX Based Pseudo-Random Number Generator in FPGA Using VHDL and Impulse C

    Monte Carlo simulations are widely used, e.g. in physics and molecular modelling. High-performance random number generators, such as RANLUX or Mersenne Twister, play the main role in these simulations. In this paper the authors introduce the world's first implementation of the RANLUX algorithm on an FPGA platform for high-performance computing purposes. A significant speed-up of a single generator instance, over 60 times compared with a graphics-card-based solution, can be observed. Comparisons with competing solutions were made and are also presented. The proposed solution has an extremely low power demand, consuming less than 2.5 watts per RANLUX core, which makes it well suited to environmentally friendly, energy-efficient supercomputing solutions and embedded systems.

    The Algorithms for FPGA Implementation of Sparse Matrices Multiplication

    Compared with dense matrix multiplication, the real performance of sparse matrix multiplication on a CPU is roughly 5 to 100 times lower when expressed in GFLOPS. For sparse matrices, microprocessors spend most of their time comparing matrix indices rather than performing floating-point multiply and add operations. For 16-bit integer operations, such as index comparisons, the computational power of an FPGA significantly surpasses that of a CPU. Consequently, this paper presents a novel theoretical study of how the matrix sparsity factor influences the ratio of index comparisons to floating-point operations. As a result, a novel FPGA architecture for sparse matrix-matrix multiplication is presented in which index comparison and floating-point operations are separated. We also verified our idea in practice, and the results of the initial implementations are very promising. To further decrease the hardware resources required by the floating-point multiplier, a reduced-width multiplication is proposed for cases in which IEEE-754 compliance is not required.
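The imbalance between index comparisons and floating-point work can be seen in a single sparse dot product, the inner step of sparse matrix-matrix multiplication. The sketch below is an illustration only, not the architecture proposed in the paper: it assumes each operand is stored as a sorted index list plus a value list, and the function name `sparse_dot` and the counting convention (one three-way comparison per loop step, two floating-point operations per match) are assumptions.

```python
def sparse_dot(a_idx, a_val, b_idx, b_val):
    """Dot product of two sparse vectors given as sorted indices and values.

    Returns (result, index_comparisons, floating_point_ops).
    """
    i = j = 0
    total = 0.0
    comparisons = 0
    flops = 0
    while i < len(a_idx) and j < len(b_idx):
        comparisons += 1                  # one three-way index comparison
        if a_idx[i] == b_idx[j]:
            total += a_val[i] * b_val[j]  # one multiply and one add
            flops += 2
            i += 1
            j += 1
        elif a_idx[i] < b_idx[j]:
            i += 1                        # index only present in a: skip it
        else:
            j += 1                        # index only present in b: skip it
    return total, comparisons, flops
```

For example, `sparse_dot([0, 3, 5, 9], [1.0, 2.0, 3.0, 4.0], [1, 3, 4, 9], [5.0, 6.0, 7.0, 8.0])` performs six index comparisons for only four floating-point operations, and the gap widens as fewer indices match.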

    FPGA Implementation of Procedures for Video Quality Assessment

    Video resolutions used in a variety of media are constantly rising. While manufacturers struggle to perfect their screens, it is also important to ensure high quality of the displayed image. Overall quality can be measured using the Mean Opinion Score (MOS). Video quality can be affected by miscellaneous artifacts that appear at every stage of video creation and transmission. In this paper, we present a solution for calculating four distinct video quality metrics that can be applied in a real-time video quality assessment system. Our assessment module is capable of processing 8K resolution in real time at 30 frames per second. The throughput of 2.19 GB/s surpasses the performance of pure software solutions. The module was created using a high-level language in order to concentrate on architectural optimization.

    Difficulties in recognizing granulomatosis with polyangiitis (GPA) in elderly patients undergoing diagnostic thoracotomy twice — a report of two cases

    Granulomatosis with polyangiitis (GPA) is defined as a necrotizing granulomatous inflammation usually involving the upper and lower respiratory tract with necrotizing vasculitis affecting predominantly small to medium vessels. Because of non-specific symptoms, its radiological presentation, and the diversity of its clinical expression, it is not uncommon for it to be misdiagnosed, especially in the elderly. Although biopsy and histological examination seem to be essential for GPA diagnosis, their results are sometimes ambiguous and not helpful in making a decision. In this report, we present difficulties in the recognition of GPA in two elderly patients in whom, despite diagnostic thoracotomy being performed twice, GPA was recognized almost 4 and 6 years after the first symptoms.

    Variability of Stem Length of Common Vetch (Vicia sativa L. ssp. sativa) Cultivars with Determinate and Indeterminate Growth and Its Influence on Some Utility Traits

    In 2001 and 2002, six experiments were conducted to examine the factors determining the variability of stem length and its impact on the cropping features of determinate and indeterminate cultivars of common vetch. Rainfall in June and July, as well as during the whole growing season, was positively correlated with stem length but negatively correlated with seed yield, to a larger extent in the group of indeterminate cultivars than in the determinate one. Duration of the blooming stage, stem length, and seed yield showed the largest variability in both groups. An increase in the stem length of plants of indeterminate cultivars led to delayed and less even maturation and to a decrease in thousand-seed weight and seed yield. An increase in the stem length of plants of determinate cultivars delayed the phase of technical maturation and decreased the evenness of plant maturation. Determinate growth of common vetch did not lead to a reduction in lodging.

    Parallelized Algorithms for Finding Similar Images and Object Recognition

    The paper addresses the issue of searching for similar images and objects in a repository of information. The contained images are annotated with the help of sparse descriptors. In the presented research, different color and edge histogram descriptors were used. To measure similarities among images, various color descriptors are compared. For this purpose, different distance measures were employed. In order to decrease execution time, several code optimization and parallelization methods are proposed. Results of these experiments, as well as a discussion of the advantages and limitations of different combinations of methods, are presented.
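Comparing histogram descriptors with a distance measure and ranking a repository by that distance might look like the sketch below. The abstract does not name the measures used, so the L1 and chi-square distances here are assumptions picked as common choices, and all function names are hypothetical.

```python
def l1_distance(h1, h2):
    """Sum of absolute bin differences between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def chi_square_distance(h1, h2):
    """Chi-square distance; bins that are empty in both histograms are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def rank_by_similarity(query, repository, distance):
    """Return image names from `repository` sorted from most to least similar."""
    return sorted(repository, key=lambda name: distance(query, repository[name]))
```

Each distance evaluation is independent of the others, so the loop over the repository is a natural place to apply parallelization.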